Extending Word Highlighting in Multiparticipant Chat

Authors

  • David C. Uthus
  • David W. Aha
Abstract

We describe initial work on extensions to word highlighting for multiparticipant chat to aid users in finding messages of interest, especially during times of high traffic in chat rooms. We have annotated a corpus of chat messages from a technical chat domain (Ubuntu’s technical support), indicating whether they are related to Ubuntu’s new desktop environment Unity. We also created an unsupervised learning algorithm, in which relations are represented with a graph, and applied this to find words related to Unity so they can be highlighted in new, unseen chat messages. On the task of finding relevant messages, our approach outperformed two baseline approaches that are similar to current state-of-the-art word highlighting methods in chat clients.

Related resources

The Ubuntu Chat Corpus for Multiparticipant Chat Analysis

We present the Ubuntu Chat Corpus as a data source for multiparticipant chat analysis. This addresses the lack of a large, publicly available corpus suitable for research in this medium. The advantages of using this corpus for research are its large number of chat messages, its multiple languages, its technical nature, and the fact that all of the original chat messages are in the public domain.

Multiparticipant chat analysis: A survey

We survey research on the analysis of multiparticipant chat. Multiple research and applied communities (e.g., AI, educational, law enforcement, military) have interest in this topic. After introducing some context, we describe relevant problems and how these have been addressed using AI techniques. We also identify recent research trends and unresolved issu...

Detecting Bot-Answerable Questions in Ubuntu Chat

Ubuntu’s Internet Relay Chat technical support channel has bots that output specific messages in response to command words from other channel users. These messages can be used to answer frequently-asked questions instead of requiring an expert to (repeatedly) type a lengthy reply. We describe an approach to automatically distinguish bot-answerable questions, which would mitigate this problem. T...

Chat Disentanglement: Identifying Semantic Reply Relationships with Random Forests and Recurrent Neural Networks

Thread disentanglement is a precursor to any high-level analysis of multiparticipant chats. Existing research approaches the problem by calculating the likelihood of two messages belonging in the same thread. Our approach leverages a newly annotated dataset to identify reply relationships. Furthermore, we explore the usage of an RNN, along with large quantities of unlabeled data, to learn seman...

Bertrand’s Paradox Revisited: More Lessons about that Ambiguous Word, Random

The Bertrand paradox question is: “Consider a unit-radius circle for which the length of a side of an inscribed equilateral triangle equals √3. Determine the probability that the length of a ‘random’ chord of a unit-radius circle has length greater than √3.” Bertrand derived three different ‘correct’ answers, the correctness depending on interpretation of the word, random. Here we employ geomet...

Publication date: 2013